Results 1 - 10 of 10
1.
Sustainability ; 15(9):7097, 2023.
Article in English | ProQuest Central | ID: covidwho-2312751

ABSTRACT

Real-world applications often involve imbalanced datasets, in which examples are unevenly distributed across classes. When building a system that requires high accuracy, classifier performance is crucial. However, imbalanced datasets can lead to poor classification performance, even with conventional techniques such as the synthetic minority oversampling technique (SMOTE). This study therefore proposes balancing the datasets using adversarial learning methods such as generative adversarial networks. The effect of data augmentation was evaluated on both the balanced and imbalanced datasets. The study evaluated the classification performance on three different datasets and applied data augmentation techniques to generate synthetic data for the minority class. Before the augmentation, a decision tree was applied to measure the classification accuracy on all three datasets; the accuracies obtained were 79.9%, 94.1%, and 72.6%. A decision tree was then used to evaluate the performance after data augmentation, and the results showed that the proposed model achieved accuracies of 82.7%, 95.7%, and 76% on a highly imbalanced dataset. This study demonstrates the potential of using data augmentation to improve classification performance on imbalanced datasets.

2.
12th International Conference on Electrical and Computer Engineering, ICECE 2022 ; : 248-251, 2022.
Article in English | Scopus | ID: covidwho-2290742

ABSTRACT

Right at the end of 2019, the world saw an outbreak of a new type of SARS (severe acute respiratory syndrome) disease, SARS-CoV-2, or COVID-19. Even in 2022, around 1 million people worldwide are getting infected with the virus every day. To date, more than 6 million people have died as a result of the virus. To tackle the pandemic, the first step is to successfully detect the virus among the mass population. The most popular method is the RT-PCR test, which, unfortunately, is not always conclusive. Physicians thus suggest lung CT tests for patients for clinical relevance. The problem with lung CT scans for the detection of coronavirus is that a COVID-19-infected scan is very similar to a community-acquired pneumonia (CAP) infected scan, and the results are in many cases wrongly interpreted. In addition, the virus is always mutating into different strains, and the severity and infection pattern change slightly with each mutation. Because of this rapid mutation, a large and balanced dataset of lung CT scans is not always available. In this work, we systematically evaluate the accuracy of a deep 3D convolutional neural network (CNN) on a small-scale and highly imbalanced dataset of lung CT scans (the SPGC COVID 2021 dataset). Our experiments show that it can outperform previous state-of-the-art 3D CNN models with proper regularization, an appropriate number of dense layers, and a weighted loss function. Our research, therefore, suggests an effective solution for identifying COVID-19 in lung CT scans using deep learning for small and highly imbalanced datasets. © 2022 IEEE.
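
The abstract above names a weighted loss function as one ingredient but does not say how the weights were chosen. As a minimal sketch, assuming inverse-frequency class weights and a toy 8:2 label split (both are illustrative assumptions, not the authors' configuration), a class-weighted cross-entropy can be written as:

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)

# Toy data: 8 majority-class labels (0) and 2 minority-class labels (1).
labels = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labels)

# A model that outputs 50/50 for every sample.
uniform_probs = [[0.5, 0.5] for _ in labels]
loss = weighted_cross_entropy(uniform_probs, labels, weights)
```

With these weights each minority sample contributes four times as much to the loss as a majority sample, so errors on the rare class are penalised more heavily during training.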

3.
5th International Conference on Networking, Information Systems and Security, NISS 2022 ; 2022.
Article in English | Scopus | ID: covidwho-2300967

ABSTRACT

One of the data processing problems in machine learning is imbalanced classes. Imbalanced classes can cause bias towards the majority classes, because machine learning algorithms presume that the number of objects in each class is roughly similar. Oversampling, or generating new objects in the minority class, is a common approach for balancing the dataset. In text oversampling methods, loss of semantic meaning often occurs when deep learning algorithms are used. We propose synonym-based text generation for restructuring the imbalanced COVID-19 online-news dataset. Three deep learning models (MLP, CNN, and LSTM) using TF/IDF and word embedding (WE) features are tested with the original and balanced datasets. The results indicate that the balance condition of the dataset and the use of text representative features affect the performance of the deep learning model. Using balanced data and deep learning models with WE yields significantly higher classification performance, with gains as high as 4%, 5%, and 6% across accuracy, precision, recall, and F1-score. © 2022 IEEE.
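
The abstract does not describe how synonyms are chosen or applied. The following is a minimal sketch of synonym-based oversampling, assuming a small hand-written synonym table (`SYNONYMS` is hypothetical; a real system would likely draw on a thesaurus or lexical database) and replacement of every word that has an entry:

```python
import random

# Hypothetical synonym table, for illustration only.
SYNONYMS = {
    "virus": ["pathogen"],
    "rise": ["increase", "climb"],
    "cases": ["infections"],
}

def synonym_variant(text, rng, synonyms=SYNONYMS):
    """Replace each word that has a table entry with a randomly chosen synonym."""
    words = text.split()
    return " ".join(rng.choice(synonyms[w]) if w in synonyms else w for w in words)

def oversample_minority(texts, target_size, rng):
    """Append synonym variants of existing texts until target_size is reached."""
    result = list(texts)
    while len(result) < target_size:
        result.append(synonym_variant(rng.choice(texts), rng))
    return result

rng = random.Random(0)
minority = ["virus cases rise", "cases rise again"]
balanced = oversample_minority(minority, 5, rng)
```

Because each generated text keeps the word order and only swaps words for synonyms, the new samples differ on the surface while staying close in meaning to the originals.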

4.
2nd International Conference on Advanced Network Technologies and Intelligent Computing, ANTIC 2022 ; 1798 CCIS:3-15, 2023.
Article in English | Scopus | ID: covidwho-2258989

ABSTRACT

The COVID-19 pandemic places additional constraints on hospitals and medical services. Understanding how long COVID-19-infected patients admitted to hospitals will require support is critical for resource distribution planning in hospitals, particularly in resource-reserved settings. Machine learning techniques are being used to approximate a patient's duration of stay in the hospital. This research uses Decision Tree, Random Forest, K-Nearest Neighbors, Voting, and Stacking classifiers to predict patients' length of stay in the hospital. Adaptive Synthetic (ADASYN) sampling was used to address the imbalance in the dataset, and the permutation feature importance method was employed to compute feature importance scores and identify important features during model development. The proposed "ADASEML" has shown superior performance to earlier works, with an accuracy of 80%, precision of 78%, and recall of 80%. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
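
Permutation feature importance, mentioned in the abstract above, scores a feature by how much a model's performance drops when that feature's values are shuffled. The following is a minimal sketch under stated assumptions: the toy data, the stand-in `model` function, and the use of accuracy as the score are all illustrative and not taken from the paper.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats, rng):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy setup: the stand-in model looks only at feature 0.
def model(row):
    return int(row[0] > 0.5)

rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [model(row) for row in X]
importances = permutation_importance(model, X, y, n_repeats=5, rng=rng)
```

Here shuffling feature 1 leaves every prediction unchanged, so its importance is zero, while shuffling feature 0 breaks the link between inputs and labels and yields a positive score.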

5.
2022 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies, 3ICT 2022 ; : 721-727, 2022.
Article in English | Scopus | ID: covidwho-2213129

ABSTRACT

Machine learning for Covid-19 diagnosis from blood tests is a topical problem. Many studies of this problem are mainly devoted to comparing the efficiency of various algorithms. However, the first and often the most critical part of machine learning is the preparation of a relevant and correct dataset of the required size for developing models that generalize. This study demonstrates the lack of generalization performance of models based on some publicly available datasets. That leads to the futility of such models in practice, even if they were developed using the best algorithms and achieved high metrics. Therefore, another dataset is proposed, and its features are discussed. Because the data are imbalanced, this dataset is split into training and testing sets by stratification. Machine learning models for the problem are developed with various algorithms based on the proposed dataset. The modelling results on the testing set have demonstrated that the best models - Gradient Boosting Classifier with the imbalance-fixing methods SMOTE and ADASYN, TensorFlow and Gene Expression Programming - handle negative Covid-19 diagnosis well enough, since they have high precision and high recall. However, mixed signals have been obtained for a positive Covid-19 diagnosis. TensorFlow and Gene Expression Programming models have high precision and relatively low recall for positive Covid-19 diagnosis. This means these models can't detect Covid-19 well enough but are highly reliable when they do. Gradient Boosting Classifier models do not have sufficiently high precision and recall for positive Covid-19 diagnosis. New challenges of machine learning for Covid-19 diagnosis based on blood tests are found for future work. © 2022 IEEE.
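
The reading of "high precision, relatively low recall" in the abstract above can be made concrete with a small worked example. The labels and predictions below are invented for illustration and are not data from the study:

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy case: 4 true positives exist, but the model flags only one of them.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

In this toy case every positive prediction is correct (precision 1.0), yet three of the four actual positives are missed (recall 0.25), which is the pattern the abstract describes for positive diagnoses.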

6.
Journal of Advances in Information Technology ; 13(5):530-538, 2022.
Article in English | Scopus | ID: covidwho-2056413

ABSTRACT

COVID-19 (coronavirus disease), which is caused by the SARS-CoV-2 virus, has spread worldwide and has become a pandemic. Because the number of cases increases daily, interpreting the laboratory findings takes time, resulting in limitations of findings. Because of these limitations, the need for a clinical decision-making system with predictive algorithms has arisen. By identifying diseases, predictive algorithms would be able to reduce the strain on healthcare systems. In this work, we developed clinical predictive models using machine learning techniques, with the help of the SMOTE+ENN hybrid technique and laboratory data, that can accurately predict which patients have COVID-19. To evaluate our prediction models, precision, F1-score, recall, AUC, and accuracy are employed as evaluation metrics. Using data from 600 patients and 10 laboratory findings, the different models are tested and validated with 10-fold cross-validation and holdout cross-validation approaches. The experimental results show that our predictive models can correctly identify patients with COVID-19 with an accuracy of 98.28%, an F1-score of 98.27%, a precision of 98.23%, a recall of 98.32%, and an AUC of 98.32% in the holdout cross-validation approach, and an accuracy of 97.42%, an F1-score of 97.82%, a precision of 97.63%, a recall of 98.05%, and an AUC of 92.66% in the 10-fold cross-validation approach. The results of the experiments showed that all machine learning models in the holdout cross-validation approach outperformed those in the 10-fold cross-validation approach. Finally, to help medical experts prioritize resources accurately, predictive models based on laboratory findings have been developed that can assist in predicting COVID-19 infection and help medical professionals identify which medical resources are most valuable. © 2022 J. Adv. Inf. Technol.
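
The 10-fold cross-validation mentioned above partitions the samples into ten folds, each serving once as the test set. The following is a minimal sketch of the index bookkeeping, assuming contiguous folds without shuffling or stratification (the abstract does not state which variant was used); only the sample count of 600 comes from the abstract.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 600 samples, as in the abstract, split into 10 folds of 60.
folds = list(k_fold_indices(600, 10))
```

Every sample appears in exactly one test fold, so the averaged score reflects all 600 samples, whereas a holdout estimate depends on a single train/test split.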

7.
8th NAFOSTED Conference on Information and Computer Science, NICS 2021 ; : 398-403, 2021.
Article in English | Scopus | ID: covidwho-1774680

ABSTRACT

The COVID-19 pandemic is expanding and affecting human lives all over the world. Although people in many countries have been vaccinated, the number of new COVID-19 patients is still increasing. Early detection of COVID-19 using machine learning algorithms can help find effective treatment plans. We propose transfer learning models to detect pneumonia caused by this virus from chest X-ray images. A public dataset is used in this work, together with new chest X-ray images of COVID-19 patients collected by An Giang Regional General Hospital. These images enrich the current public dataset and improve prediction performance. Six transfer learning architectures are investigated using the locally collected and public datasets. The experimental results show that the DenseNet121 transfer learning model outperforms the others, with accuracy, precision, recall, F1-score, and AUC of 98.51%, 98.54%, 98.51%, 98.05% and 99.15%, respectively, on the augmented dataset, and that most algorithms show improved performance when the new data are included. © 2021 IEEE.

8.
Comput Electr Eng ; 100: 107971, 2022 May.
Article in English | MEDLINE | ID: covidwho-1773226

ABSTRACT

The coronavirus pandemic has affected people all over the world and posed a great challenge to international health systems. To aid early detection of coronavirus disease-2019 (COVID-19), this study proposes a real-time detection system based on the Internet of Things framework. The system collects real-time data from users to determine potential coronavirus cases, analyses treatment responses for people who have been treated, and accurately collects and analyses the datasets. Artificial intelligence-based algorithms are an alternative decision-making solution to extract valuable information from clinical data. This study develops a deep learning optimisation system that can work with imbalanced datasets to improve the classification of patients. A synthetic minority oversampling technique is applied to solve the problem of imbalance, and a recursive feature elimination algorithm is used to determine the most effective features. After data balance and extraction of features, the data are split into training and testing sets for validating all models. The experimental predictive results indicate good stability and compatibility of the models with the data, providing maximum accuracy of 98% and precision of 97%. Finally, the developed models are demonstrated to handle data bias and achieve high classification accuracy for patients with COVID-19. The findings of this study may be useful for healthcare organisations to properly prioritise assets.

9.
20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021 ; : 1299-1306, 2021.
Article in English | Scopus | ID: covidwho-1741207

ABSTRACT

COVID-19-related pneumonia requires different modalities of Intensive Care Unit (ICU) interventions at different times to facilitate breathing, depending on severity progression. The ability of clinical staff to predict, on a daily basis, whether patients admitted to hospital will require more or less ICU treatment is critical to ICU management. For real datasets that are sparse and incomplete and where the most important state transitions (dismissal, death) are rare, a standard Hidden Markov Model (HMM) approach is insufficient, as it is prone to overfitting. In this paper we propose a more sophisticated ensemble-based approach that involves training multiple HMMs, each specialized in a subset of the state transitions, and then obtaining the more plausible predictions by either selecting or combining the models. We have validated the approach on a live dataset of about 1,000 patients from a partner hospital. Our results show that, for rare events as well as for transitions to the most severe treatments, the approach outperforms state-of-the-art approaches. © 2021 IEEE.

10.
Intell Based Med ; 3: 100023, 2020 Dec.
Article in English | MEDLINE | ID: covidwho-957097

ABSTRACT

Almost every dataset these days faces the predicament of class imbalance. It is difficult to train classifiers on these types of data as they become biased towards a set of classes, leading to a reduction in classifier performance. This setback is often tackled by the use of various over-sampling or under-sampling algorithms. The method that stood out among these numerous algorithms is the Synthetic Minority Oversampling Technique (SMOTE). SMOTE generates synthetic samples of the minority class by oversampling each data-point using linear combinations of existing minority class neighbors. Each minority data sample generates an equal number of synthetic data. As the world is suffering from the plight of the COVID-19 pandemic, the authors applied the idea to help boost classification performance when detecting this deadly virus. This paper presents a modified version of SMOTE known as Outlier-SMOTE, wherein each data-point is oversampled with respect to its distance from other data-points. A data-point that is farther from the other data-points is given greater importance and is oversampled more than its counterparts. Outlier-SMOTE reduces the chances of overlapping of minority data samples, which often occurs in the traditional SMOTE algorithm. This method is tested on five benchmark datasets and is eventually tested on a COVID-19 dataset. F-measure, Recall and Precision are used as principal metrics to evaluate the performance of the classifier, as is the case for any class-imbalanced dataset. The proposed algorithm performs considerably better than the traditional SMOTE algorithm for the considered datasets.
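
The abstract gives the idea behind Outlier-SMOTE but not its exact formula. The sketch below is an interpretation under stated assumptions: each minority point's share of synthetic samples is taken to be proportional to its total Euclidean distance to the other minority points, and each synthetic point is interpolated towards the single nearest minority neighbour (standard SMOTE usually picks among k nearest neighbours). The function names and toy points are illustrative only.

```python
import math
import random

def interpolate(a, b, rng):
    """SMOTE-style synthetic point on the segment between a and b."""
    gap = rng.random()
    return [x + gap * (y - x) for x, y in zip(a, b)]

def distance_shares(points):
    """Assumed weighting: share proportional to total distance to the other points."""
    totals = [sum(math.dist(p, q) for q in points) for p in points]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]

def nearest_neighbour(points, i):
    """Index of the minority point closest to points[i], excluding itself."""
    others = (k for k in range(len(points)) if k != i)
    return min(others, key=lambda k: math.dist(points[i], points[k]))

def outlier_weighted_oversample(points, n_new, rng):
    """Draw base points in proportion to their shares, then interpolate."""
    shares = distance_shares(points)
    synthetic = []
    for _ in range(n_new):
        i = rng.choices(range(len(points)), weights=shares)[0]
        j = nearest_neighbour(points, i)
        synthetic.append(interpolate(points[i], points[j], rng))
    return synthetic

# Toy minority class: three clustered points and one outlier at (5, 5).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
shares = distance_shares(minority)
synthetic = outlier_weighted_oversample(minority, 20, random.Random(0))
```

In this toy set the outlier receives the largest share, so more synthetic points are generated along the segment from it towards the cluster, whereas plain SMOTE would give every point an equal share.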
